fix test/ops/self_attention.py #5

liulog · 2025-08-18T11:38:01Z

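After implementing the KV cache, I found that the tokens produced in the Prefill stage were correct, but the tokens produced in the Decode stage were wrong. Inspecting the tensors pointed to the self-attention part, and I eventually traced the problem to softmax: the softmax in my self-attention operator was wrong when qlen != kvlen, which is exactly the case when the KV cache is in use. Even so, the operator passed the test in test/ops/self_attention.py. After I added past_len = total_len - seqlen to my implementation, inference became correct, but the operator no longer passed the self-attention test. From this I conclude that test/ops/self_attention.py is also wrong.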
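To illustrate the past_len adjustment described above, here is a minimal pure-Python sketch. It is not the actual operator code: the function name `causal_softmax_row` and the toy scores are hypothetical, and it assumes that query row i of the current chunk sits at absolute position past_len + i, so it may attend to keys 0 through past_len + i.

```python
import math

def causal_softmax_row(scores, i, seqlen, total_len):
    """Softmax over one query row with a causal mask that accounts for the KV cache.

    scores: list of total_len raw attention scores for query row i.
    i: index of the query within the current chunk (0 <= i < seqlen).
    """
    past_len = total_len - seqlen      # tokens already held in the KV cache
    last_visible = past_len + i        # absolute position of this query (assumption)
    visible = scores[: last_visible + 1]
    m = max(visible)                   # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in visible]
    denom = sum(exps)
    # Masked (future) positions get probability 0.
    return [e / denom for e in exps] + [0.0] * (total_len - last_visible - 1)

# Decode step: one new query (seqlen=1) against 4 keys in total.
decode_probs = causal_softmax_row([0.0, 0.0, 0.0, 0.0], i=0, seqlen=1, total_len=4)

# Prefill: the first of 4 queries sees only key 0.
prefill_probs = causal_softmax_row([0.0, 0.0, 0.0, 0.0], i=0, seqlen=4, total_len=4)
```

With past_len taken into account, the single decode query attends to all four keys. Without it (treating past_len as 0), that query would see only key 0, which would be consistent with Prefill being correct and Decode being wrong.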

Analysis:

# Previous implementation
temp_mask = torch.ones(L, S, dtype=torch.bool).tril(diagonal=0)
# Revised implementation
temp_mask = torch.ones(S, S, dtype=torch.bool).tril(diagonal=0)[-L:, ]

Mask contents in the previous test:

Mask contents as they should be:

With this part of the logic changed, both the self-attention and the infer CI tests pass.
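To make the difference between the two masks concrete, here is a small sketch that needs no dependencies. It mirrors the two `torch` expressions above with plain lists; the helper names `tril_mask` and `show` and the sizes L = 2, S = 4 are chosen for illustration only.

```python
def tril_mask(rows, cols):
    """Boolean lower-triangular mask, mirroring
    torch.ones(rows, cols, dtype=torch.bool).tril(diagonal=0)."""
    return [[j <= i for j in range(cols)] for i in range(rows)]

def show(mask):
    """Render each mask row as a string of 0s and 1s."""
    return ["".join("1" if v else "0" for v in row) for row in mask]

L, S = 2, 4  # 2 new queries, 4 keys in total (2 cached + 2 new)

old_mask = tril_mask(L, S)       # previous test: triangle aligned to the top-left corner
new_mask = tril_mask(S, S)[-L:]  # revised test: last L rows of the S x S triangle

print(show(old_mask))  # ['1000', '1100']
print(show(new_mask))  # ['1110', '1111']
```

In the old mask the two new queries cannot see the cached keys, whereas in the revised mask the last query sees all S keys, as a causal mask with a KV cache should allow. When L == S (no cache) the two masks are identical.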

Below is a screenshot of the passing CI run:

PanZezhong1725 · 2025-08-19T02:46:07Z

Thank you very much. The issue has been reproduced.

fix test/ops/self_attention.py

00d651a

PanZezhong1725 requested changes Aug 19, 2025

fix test/ops/self_attention.py

896616b

PanZezhong1725 approved these changes Aug 19, 2025

PanZezhong1725 merged commit 2945515 into InfiniTensor:main Aug 19, 2025
0 of 2 checks passed

liulog deleted the fix-self_attention-test branch August 19, 2025 09:32
